Data Analysis codeΒΆ

Note:

  • The following code will not work in its entirety since the whole original dataset is not provided. The .csv file s2n_data.csv is the version that you can follow along with. There is a not in the document where you can pick this up. Do import the modules below, otherwise the code will not work.
InΒ [1]:
# Module imports for the data analysis
import pandas as pd
import re
from RS_RTxReadBin import RTxReadBin
import optoanalysis as oa
from matplotlib import pyplot as plt
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report, roc_curve, auc
from sklearn.preprocessing import LabelBinarizer
from tqdm import tqdm
import seaborn as sns
from sklearn.metrics import confusion_matrix, classification_report, roc_curve, auc
pyCUDA not present on system, function calc_fft_with_PyCUDA and calc_ifft_with_PyCUDA will crash
skcuda not present on system, function calc_fft_with_PyCUDA and calc_ifft_with_PyCUDA will crash
c:\Users\tjwil\anaconda3\envs\NATS3005\lib\site-packages\optoanalysis\thermo\thermo.py:71: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
  def calc_hamiltonian(self, mass, omega_array):
c:\Users\tjwil\anaconda3\envs\NATS3005\lib\site-packages\optoanalysis\thermo\thermo.py:100: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
  def calc_phase_space_density(self, mass, omega_array, temperature_array):
c:\Users\tjwil\anaconda3\envs\NATS3005\lib\site-packages\optoanalysis\thermo\thermo.py:129: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
  def extract_thermodynamic_quantities(self,temperature_array):
c:\Users\tjwil\anaconda3\envs\NATS3005\lib\site-packages\optoanalysis\thermo\thermo.py:185: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
  def calc_partition_function(mass, omega_array, temperature_array):
c:\Users\tjwil\anaconda3\envs\NATS3005\lib\site-packages\optoanalysis\thermo\thermo.py:210: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
  def calc_entropy(phase_space_density_array):

Error RemovalΒΆ

InΒ [2]:
# Loading the master table data CSV file
data = pd.read_csv("master_table_data.csv")
InΒ [3]:
# Function to extract the number value from the optoanalysis data
def extract_value(input_string):
    input_string = str(input_string) # Changing the variable type to a string
    pattern = re.compile(r"\(?([-+]?\d*\.?\d+)(?:[eE]([-+]?\d+))?\+/-([-+]?\d*\.?\d+)(?:[eE]([-+]?\d+))?\)?")
    match = pattern.match(input_string) # Finding matches

    if not match: # If the match is not in the expected format
        return input_string # Return the input string      

    value = float(match.group(1))
    error = float(match.group(3))

    # Extract and apply orders of magnitude
    value_order = int(match.group(2)) if match.group(2) else 0
    error_order = int(match.group(4)) if match.group(4) else 0
    value *= 10 ** value_order
    error *= 10 ** error_order

    # Use manual scientific notation formatting for brackets
    if input_string.startswith('('):
        input_string
        match = re.search(r'e([+-]?\d+)', input_string)
        order_of_magnitude = int(match.group(1))
        value_sci = value * 10**order_of_magnitude
        return float(value_sci)
    else:
        return value
InΒ [4]:
# Cleaning the data for which the PSD Lorentzian curve was not a good fit
clean_data = data[data["Are the PSD peaks easy to fit?"] != "No"]
# Removing the columns Are the PSD peaks easy to fit? and Rough Peak frequencies, this is because
# optoanalysis gives a better peak frequency where the curve is fitted
clean_data = clean_data.drop(columns=["Are the PSD peaks easy to fit?", "f1 Rough Peak Frequency (kHz)",
                                      "f2 Rough Peak Frequency (kHz)", "f3 Rough Peak Frequency (kHz)"])
InΒ [5]:
# Applying the error removal function to get the no_err_data dataset
no_err_data = clean_data.applymap(extract_value)

Adding the signal-to-noise columnsΒΆ

InΒ [6]:
# Function to get the frequency and PSD from a R&S .bin file
def get_freq_PSD(file):
    y, time_data, save_info = RTxReadBin(file) # Reading the file
    voltages = y[:, 0, 0]
    time_start = save_info["XStart"]
    time_stop = save_info["XStop"]
    signal_record_length = save_info["SignalRecordLength"]

    time_step = (time_stop - time_start) / signal_record_length

    sample_frequency = 1 / time_step

    # Getting the data object
    data = oa.load_voltage_data(voltages, SampleFreq = sample_frequency, timeStart = time_start)
    freq, PSD = data.get_PSD() # Extracting the frequency and PSD
    return freq, PSD
InΒ [7]:
# A function to get the PSD values of the three frequency peaks and the noise floor
# This returns these values, the noise floor and the signal to noise value of each peak
def get_f1_f2_f3_nf(freq, PSD, f1_peak, f2_peak, f3_peak):
    f1_index = np.argmin(np.abs(freq - f1_peak))
    f2_index = np.argmin(np.abs(freq - f2_peak))
    f3_index = np.argmin(np.abs(freq - f3_peak))
    f1_PSD = PSD[f1_index]
    f2_PSD = PSD[f2_index]
    f3_PSD = PSD[f3_index]
    threshold = 400000
    threshold_idx = np.where(freq > threshold)[0]
    threshold_PSD = PSD[threshold_idx]
    noise_floor = np.mean(threshold_PSD)
    f1_S2N = f1_PSD/noise_floor
    f2_S2N = f2_PSD/noise_floor
    f3_S2N = f3_PSD/noise_floor
    return f1_PSD, f2_PSD, f3_PSD, noise_floor, f1_S2N, f2_S2N, f3_S2N
InΒ [8]:
# Empty lists to hold the valyues
f1_signal_list = []
f2_signal_list = []
f3_signal_list = []
noise_floor_list = []
f1_S2N_list = []
f2_S2N_list = []
f3_S2N_list = []

# Iterating through each row and appending to the lists
for index, row in no_err_data.iterrows():
    file = row["File Path"]
    freq, PSD = get_freq_PSD(file)
    f1_peak = float(row["f1 Peak Frequency (Hz)"])
    f2_peak = float(row["f2 Peak Frequency (Hz)"])
    f3_peak = float(row["f3 Peak Frequency (Hz)"])
    f1_PSD, f2_PSD, f3_PSD, noise_floor, f1_S2N, f2_S2N, f3_S2N = get_f1_f2_f3_nf(freq, PSD, f1_peak, f2_peak, f3_peak)
    f1_signal_list.append(f1_PSD)
    f2_signal_list.append(f2_PSD)
    f3_signal_list.append(f3_PSD)
    noise_floor_list.append(noise_floor)
    f1_S2N_list.append(f1_S2N)
    f2_S2N_list.append(f2_S2N)
    f3_S2N_list.append(f3_S2N)
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
InΒ [9]:
# Adding the lists to the pandas dataframe
no_err_data["f1 Signal"] = f1_signal_list
no_err_data["f2 Signal"] = f2_signal_list
no_err_data["f3 Signal"] = f3_signal_list
no_err_data["Noise Floor"] = noise_floor_list
no_err_data["f1 Signal-to-noise Value"] = f1_S2N_list
no_err_data["f2 Signal-to-noise Value"] = f2_S2N_list
no_err_data["f3 Signal-to-noise Value"] = f3_S2N_list
InΒ [10]:
# Dropping the columns for file path and density, these are not required from here on
s2n_data = no_err_data.drop(columns=["File Path", "Assumed Density (kg m^-3)"])

Remove outliersΒΆ

InΒ [11]:
# Examining the dataframe
s2n_data.head()
Out[11]:
Particle Type Pressure (mbar) f1 Peak Frequency (Hz) f2 Peak Frequency (Hz) f3 Peak Frequency (Hz) f1 Radius (m) f2 Radius (m) f3 Radius (m) f1 Mass (kg) f2 Mass (kg) ... f1 Conversion Factor (V m^-1) f2 Conversion Factor (V m^-1) f3 Conversion Factor (V m^-1) f1 Signal f2 Signal f3 Signal Noise Floor f1 Signal-to-noise Value f2 Signal-to-noise Value f3 Signal-to-noise Value
0 SiNP 3.54 134922.0 211863.0 239476.0 1.450000e-07 1.300000e-07 1.190000e-07 2.280000e-17 1.650000e-17 ... 200000.0 129000.0 44000.0 5.498862e-09 8.050425e-10 2.020248e-10 2.514141e-13 21871.732422 3202.058105 803.553955
1 SiNP 3.54 139785.0 213320.0 239286.0 1.420000e-07 1.280000e-07 6.000000e-08 2.180000e-17 1.590000e-17 ... 500000.0 255000.0 11900.0 2.838856e-08 2.763101e-09 7.373276e-11 3.328379e-13 85292.453125 8301.641602 221.527573
2 SiNP 3.58 139738.0 191300.0 226740.0 1.090000e-07 2.890000e-08 1.990000e-08 9.700000e-18 1.820000e-19 ... 179000.0 1480.0 760.0 2.965766e-09 3.955848e-12 2.025581e-12 2.581418e-13 11488.903320 15.324324 7.846777
4 SiNP 3.56 135483.0 202928.0 225030.0 1.790000e-07 1.590000e-07 5.030000e-08 4.300000e-17 3.040000e-17 ... 690000.0 350000.0 6200.0 3.389115e-08 4.139050e-09 2.588705e-11 3.991096e-13 84916.898438 10370.708984 64.862022
5 SiNP 3.53 136054.0 209189.0 234708.0 1.380000e-07 1.480000e-07 8.720000e-08 1.980000e-17 2.430000e-17 ... 274000.0 199000.0 17400.0 5.865347e-09 2.247756e-09 3.467187e-11 2.710740e-13 21637.441406 8292.039062 127.905571

5 rows Γ— 27 columns

InΒ [12]:
# Saving the data to a .csv file
s2n_data.to_csv("s2n_data.csv", index=False)

Pickup the code here using the provided s2n_data.csv file in the repository

InΒ [13]:
# Reading the data from the .csv file
s2n_data = pd.read_csv("s2n_data.csv")

# Function to remove outliers based on interquartile range
def remove_outliers(group):
    # Separates particle type from the numeric columns for outlier detection
    particle_type = group['Particle Type']
    numeric_data = group.drop('Particle Type', axis=1)
    
    # Calculate Q1 and Q3
    Q1 = numeric_data.quantile(0.25)
    Q3 = numeric_data.quantile(0.75)
    
    # Calculate IQR
    IQR = Q3 - Q1
    
    # Define the lower and upper bounds for each column
    lower_bound = Q1 - 1.5 * IQR
    upper_bound = Q3 + 1.5 * IQR
    
    # Filtering the data keeping only values within the bounds for all columns
    filtered_data = numeric_data[~((numeric_data < lower_bound) | (numeric_data > upper_bound)).any(axis=1)]
    
    # Reattach the particle type column to the filtered data
    filtered_data['Particle Type'] = particle_type[filtered_data.index]
    
    return filtered_data

# Group by particle type and apply the remove_outliers function for each group
no_outlier_data = s2n_data.groupby('Particle Type', group_keys=False).apply(remove_outliers)

# Resetting the index
no_outlier_data.reset_index(drop=True, inplace=True)

Subsetting the data into its various componentsΒΆ

InΒ [14]:
# Creating the dataset with only 25A and 25T at 1000uM concentration and SiNP
DNA_SiNPs_data = no_outlier_data.loc[no_outlier_data['Particle Type'].isin([
    '25A 1000uM ZnCl2 SiNP',
    '25T 1000uM ZnCl2 SiNP',
    'SiNP'
])]

# Saving the dataset to a .csv file
DNA_SiNPs_data.to_csv("DNA_SiNPs_data.csv", index=False)

# Creating the dataset with all of the 25T data at different ZnCl2 concentrations
T25_data = no_outlier_data.loc[no_outlier_data['Particle Type'].isin([
    '25T 100uM ZnCl2 SiNP',
    '25T 500uM ZnCl2 SiNP',
    '25T 750uM ZnCl2 SiNP',
    '25T 1000uM ZnCl2 SiNP'
])]

# Saving the dataset to a .csv file
T25_data.to_csv("T25_data.csv", index=False)

Defining the particle coloursΒΆ

InΒ [15]:
# Creating a dictionary for the particle colours for the LDA plots
particle_colours = {
    '25A 1000uM ZnCl2 SiNP': 'red',
    '25T 1000uM ZnCl2 SiNP': 'blue',
    'SiNP': 'gray',
    '1000uM ZnCl2 SiNP': 'green',
    '25T 100uM ZnCl2 SiNP': 'turquoise',
    '25T 500uM ZnCl2 SiNP': 'purple',
    '25T 750uM ZnCl2 SiNP': 'orange',
}

Linear Discriminant Analysis (LDA) PlotsΒΆ

InΒ [16]:
# Encoding particle type as a numeric label for comparison
DNA_SiNPs_data['Particle Type Label'], type_labels = pd.factorize(DNA_SiNPs_data['Particle Type'])

# Dropping particle type for clustering
X = DNA_SiNPs_data.drop(columns=["Particle Type", "Particle Type Label"])

# Standardising the data
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Applying LDA for dimensionality reduction
lda = LDA(n_components=2)
X_reduced = lda.fit_transform(X_scaled, DNA_SiNPs_data['Particle Type Label'])

# Plotting a single graph
fig, ax = plt.subplots(figsize=(8, 6), dpi=300)

# Plotting the true particles as a scatter plot
for particle_type, label in enumerate(type_labels):
    display_label = "Standard SiNP" if label == "SiNP" else label
    ax.scatter(
        X_reduced[DNA_SiNPs_data['Particle Type Label'] == particle_type, 0],
        X_reduced[DNA_SiNPs_data['Particle Type Label'] == particle_type, 1],
        label=display_label,
        alpha=0.7,
        color=particle_colours[label],
        edgecolors='black'
    )

# Creating handles for the particle type legend using circles for particles
particle_handles = [plt.Line2D([0], [0], marker='o', color='w',
                               markerfacecolor=particle_colours[label], markersize=10,
                               label=("Standard SiNP" if label == "SiNP" else label)
                                     .replace("ZnCl2", "ZnCl$_2$")
                                     .replace("uM", " ΞΌM")) for label in type_labels]

# Adding particle type legend
ax.legend(handles=particle_handles, title="Particle Type", loc="upper right")

# Axes labels
ax.set_xlabel("LDA Component 1")
ax.set_ylabel("LDA Component 2")

# Displaying the plot
plt.grid(False)
plt.tight_layout()
plt.show()
C:\Users\tjwil\AppData\Local\Temp\ipykernel_25096\3267812401.py:2: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  DNA_SiNPs_data['Particle Type Label'], type_labels = pd.factorize(DNA_SiNPs_data['Particle Type'])
No description has been provided for this image
InΒ [17]:
# Encoding particle type as a numeric label for comparison
T25_data['Particle Type Label'], type_labels = pd.factorize(T25_data['Particle Type'])

# Dropping particle type for clustering
X = T25_data.drop(columns=["Particle Type", "Particle Type Label"])

# Standardising the data
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Applying LDA for dimensionality reduction
lda = LDA(n_components=2)
X_reduced = lda.fit_transform(X_scaled, T25_data['Particle Type Label'])

# Plotting a single graph
fig, ax = plt.subplots(figsize=(8, 6), dpi=300)  # Single plot

# Plotting the true particles as a scatter plot
for particle_type, label in enumerate(type_labels):
    ax.scatter(
        X_reduced[T25_data['Particle Type Label'] == particle_type, 0],
        X_reduced[T25_data['Particle Type Label'] == particle_type, 1],
        label=f"{label}",
        alpha=0.7,
        color=particle_colours[label],
        edgecolors='black'
    )

# Creating handles for the particle type legend using circles for particles
particle_handles = [plt.Line2D([0], [0], marker='o', color='w', markerfacecolor=particle_colours[label], markersize=10, label=label.replace("ZnCl2", "ZnCl$_2$").replace("uM", " ΞΌM")) for label in type_labels]

# Adding particle type legend
ax.legend(handles=particle_handles, title="Particle Type", loc="lower left")

# Axes labels
ax.set_xlabel("LDA Component 1")
ax.set_ylabel("LDA Component 2")

# Displaying the plot
plt.grid(False)
plt.tight_layout()
plt.show()
C:\Users\tjwil\AppData\Local\Temp\ipykernel_25096\4127321902.py:2: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  T25_data['Particle Type Label'], type_labels = pd.factorize(T25_data['Particle Type'])
No description has been provided for this image

Random forest analysis and confusion matrix and ROC curvesΒΆ

DNA_SiNPs_dataΒΆ

InΒ [18]:
# Reading the DNA_SiNPs_data.csv
DNA_SiNPs_data = pd.read_csv("DNA_SiNPs_data.csv")

# Preprocessing the data by features and target variable
X = DNA_SiNPs_data.drop('Particle Type', axis=1)
y = DNA_SiNPs_data['Particle Type']

# Splitting the data into 80% training and 20% testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# The parameter grid to iterate through to find the best random forest model
param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [None, 10, 20, 30],
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 4]
}

# Initial random forest training and feature importance evaluation
clf = RandomForestClassifier(random_state=42)
clf.fit(X_train, y_train)

# Getting feature importance scores
feature_importances = pd.Series(clf.feature_importances_, index=X.columns).sort_values(ascending=False)
top_features = feature_importances.index.tolist()

# Selecting top features and tuning hyperparameters
best_accuracy = 0
best_params = None
best_features = None

# Initialising the progress bar
total_iterations = len(top_features)
with tqdm(total=total_iterations, desc="Feature Subset Iterations", unit="features") as pbar:
    for num_features in range(1, len(top_features) + 1):
        selected_features = top_features[:num_features]
        X_train_subset = X_train[selected_features]
        X_test_subset = X_test[selected_features]

        # Performing GridSearchCV for hyperparameter tuning
        grid_search = GridSearchCV(estimator=RandomForestClassifier(random_state=42), param_grid=param_grid, cv=3, scoring='accuracy', n_jobs=-1)
        grid_search.fit(X_train_subset, y_train)

        # Evaluating the test set
        y_pred = grid_search.best_estimator_.predict(X_test_subset)
        accuracy = accuracy_score(y_test, y_pred)

        # Checking if this is the best model based upon accuracy score
        if accuracy > best_accuracy:
            best_accuracy = accuracy
            best_params = grid_search.best_params_
            best_features = selected_features

        # Updating the progress bar
        pbar.update(1)

# Training the best model and generate additional outputs
best_clf = RandomForestClassifier(**best_params, random_state=42)
best_clf.fit(X_train[best_features], y_train)
y_pred = best_clf.predict(X_test[best_features])
classification_rep = classification_report(y_test, y_pred)

# Saving feature importances and classification report
final_feature_importances = pd.Series(best_clf.feature_importances_, index=best_features)

# Binarizing the target labels for multiclass ROC curve
lb = LabelBinarizer()
y_train_bin = lb.fit_transform(y_train)
y_test_bin = lb.transform(y_test)

# Getting predicted probabilities for each class
y_prob = best_clf.predict_proba(X_test[best_features])

# Generating ROC variables and AUC for each class
fpr = {}
tpr = {}
roc_auc = {}

# Iterating through each class and plot the ROC curve
for i, particle_label in enumerate(lb.classes_):
    fpr[i], tpr[i], _ = roc_curve(y_test_bin[:, i], y_prob[:, i])
    roc_auc[i] = auc(fpr[i], tpr[i])
Feature Subset Iterations: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 26/26 [04:02<00:00,  9.32s/features]
InΒ [19]:
# Generating the confusion matrix
plt.figure(figsize=(6, 5), dpi=300)
cm = confusion_matrix(y_test, y_pred)

def format_label(label):
    # Replace exact match "SiNP" with "Standard SiNP"
    if label == "SiNP":
        label = "Standard SiNP"
    label = label.replace("SiNP", "SiNPs")
    label = label.replace("uM", " ΞΌM").replace("ZnCl2", "ZnCl$_2$")
    words = label.split()
    if len(words) > 1:
        mid_point = 3
        return ' '.join(words[:mid_point]) + '\n' + ' '.join(words[mid_point:])
    return label

formatted_labels = [format_label(label) for label in lb.classes_]

ax = sns.heatmap(cm, annot=True, fmt="d", cmap="Blues", 
                 xticklabels=formatted_labels, 
                 yticklabels=formatted_labels)
colorbar = ax.collections[0].colorbar
colorbar.set_label('Number of Test SiNPs')
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.title("Confusion Matrix")
plt.xticks(rotation=0)
plt.yticks(rotation=90)
plt.show()

# Outputting the classification report and ranked feature importance as a dataframe
report_dict = classification_report(y_test, y_pred, output_dict=True)
report_df = pd.DataFrame(report_dict).transpose()

# Calculating and ranking feature importance
final_feature_importances = pd.Series(best_clf.feature_importances_, index=best_features).sort_values(ascending=False)
feature_importance_df = pd.DataFrame({'Feature': final_feature_importances.index, 'Importance': final_feature_importances.values})

# Replacing "uM" with " ΞΌM" in feature names
feature_importance_df['Feature'] = feature_importance_df['Feature'].str.replace("uM", " ΞΌM").str.replace("ZnCl2", "ZnCl$_2$")

# Plotting feature importance
plt.figure(figsize=(12, 6), dpi=300)
final_feature_importances.plot(kind='barh', color='royalblue')
plt.xlabel("Feature Importance Score")
plt.ylabel("Feature")
plt.title("Feature Importance")
plt.show()

def format_ROC_label(label):
    if label == "SiNP":
        label = "Standard SiNP"
    label = label.replace("SiNP", "SiNPs")
    label = label.replace("uM", " ΞΌM").replace("ZnCl2", "ZnCl$_2$")
    return label

# Plotting ROC Curves as 3D plots
fig = plt.figure(figsize=(12, 8), dpi=300)
ax = fig.add_subplot(111, projection='3d')

# Colours for ROC curves
colors = ['red', 'blue', 'gray']

for i, particle_label in enumerate(lb.classes_):
    formatted_label = format_label(particle_label)
    ax.plot(fpr[i], tpr[i], zs=i, zdir='y', color=colors[i % len(colors)], label=f"{formatted_label} (AUC = {roc_auc[i]:.2f})")

# Adding diagonal dashed lines for chance score
chance_x = np.linspace(0, 1, 100)
chance_y = np.linspace(0, 1, 100)
for i in range(len(lb.classes_)):
    ax.plot(chance_x, chance_y, zs=i, zdir='y', linestyle='dashed', color='black', label='Chance' if i == 0 else "")

ax.set_xlabel("False Positive Rate")
ax.set_zlabel("True Positive Rate")
ax.set_title("ROC Curves for Each Particle Type")
ax.legend(loc="upper left")
ax.set_yticks([])
ax.set_box_aspect(None, zoom=0.9)
plt.show()

# Printing the best model results
print("Best Model Results")
print("===================")
print(f"Accuracy: {best_accuracy:.2f}")
print(f"Best Hyperparameters: {best_params}\n")
print(f"Best Features: {', '.join(best_features)}")

# Printing the detailed model results
print("Detailed Model Results\n")
print("=======================\n")
print(f"Accuracy: {best_accuracy:.2f}\n\n")
print("Classification Report:\n")
print(classification_rep)
print("\n\nFeature Importances:\n")
print(final_feature_importances.sort_values(ascending=False).to_string())
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
Best Model Results
===================
Accuracy: 1.00
Best Hyperparameters: {'max_depth': None, 'min_samples_leaf': 2, 'min_samples_split': 10, 'n_estimators': 50}

Best Features: f2 A Parameter, f1 A Parameter, f1 Signal-to-noise Value, f3 A Parameter, f1 Conversion Factor (V m^-1), f2 Peak Frequency (Hz), f1 Peak Frequency (Hz), f3 Conversion Factor (V m^-1), f2 Signal-to-noise Value
Detailed Model Results

=======================

Accuracy: 1.00


Classification Report:

                       precision    recall  f1-score   support

25A 1000uM ZnCl2 SiNP       1.00      1.00      1.00         4
25T 1000uM ZnCl2 SiNP       1.00      1.00      1.00         2
                 SiNP       1.00      1.00      1.00         7

             accuracy                           1.00        13
            macro avg       1.00      1.00      1.00        13
         weighted avg       1.00      1.00      1.00        13



Feature Importances:

f2 A Parameter                   0.193572
f1 A Parameter                   0.166313
f1 Signal-to-noise Value         0.159658
f1 Peak Frequency (Hz)           0.152474
f3 A Parameter                   0.078032
f2 Peak Frequency (Hz)           0.076259
f2 Signal-to-noise Value         0.067086
f1 Conversion Factor (V m^-1)    0.058037
f3 Conversion Factor (V m^-1)    0.048569

T25_dataΒΆ

InΒ [20]:
# Reading the T25_data.csv
T25_data = pd.read_csv("T25_data.csv")

# Preprocessing the data by features and target variable
X = T25_data.drop('Particle Type', axis=1)
y = T25_data['Particle Type']

# Splitting the data into 80% training and 20% testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# The parameter grid to iterate through to find the best random forest model
param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [None, 10, 20, 30],
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 4]
}

# Initial random forest training and feature importance evaluation
clf = RandomForestClassifier(random_state=42)
clf.fit(X_train, y_train)

# Getting feature importance scores
feature_importances = pd.Series(clf.feature_importances_, index=X.columns).sort_values(ascending=False)
top_features = feature_importances.index.tolist()

# Selecting top features and tuning hyperparameters
best_accuracy = 0
best_params = None
best_features = None

# Initialising the progress bar
total_iterations = len(top_features)
with tqdm(total=total_iterations, desc="Feature Subset Iterations", unit="features") as pbar:
    for num_features in range(1, len(top_features) + 1):
        selected_features = top_features[:num_features]
        X_train_subset = X_train[selected_features]
        X_test_subset = X_test[selected_features]

        # Performing GridSearchCV for hyperparameter tuning
        grid_search = GridSearchCV(estimator=RandomForestClassifier(random_state=42), param_grid=param_grid, cv=3, scoring='accuracy', n_jobs=-1)
        grid_search.fit(X_train_subset, y_train)

        # Evaluating the test set
        y_pred = grid_search.best_estimator_.predict(X_test_subset)
        accuracy = accuracy_score(y_test, y_pred)

        # Checking if this is the best model based upon accuracy score
        if accuracy > best_accuracy:
            best_accuracy = accuracy
            best_params = grid_search.best_params_
            best_features = selected_features

        # Updating the progress bar
        pbar.update(1)

# Training the best model and generate additional outputs
best_clf = RandomForestClassifier(**best_params, random_state=42)
best_clf.fit(X_train[best_features], y_train)
y_pred = best_clf.predict(X_test[best_features])
classification_rep = classification_report(y_test, y_pred)

# Saving feature importances and classification report
final_feature_importances = pd.Series(best_clf.feature_importances_, index=best_features)

# Binarizing the target labels for multiclass ROC curve
lb = LabelBinarizer()
y_train_bin = lb.fit_transform(y_train)
y_test_bin = lb.transform(y_test)

# Getting predicted probabilities for each class
y_prob = best_clf.predict_proba(X_test[best_features])

# Generating ROC variables and AUC for each class
fpr = {}
tpr = {}
roc_auc = {}

# Iterating through each class and plot the ROC curve
for i, particle_label in enumerate(lb.classes_):
    fpr[i], tpr[i], _ = roc_curve(y_test_bin[:, i], y_prob[:, i])
    roc_auc[i] = auc(fpr[i], tpr[i])
Feature Subset Iterations: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 26/26 [04:09<00:00,  9.60s/features]
InΒ [21]:
# Generating the confusion matrix
plt.figure(figsize=(6, 5), dpi=300)
cm = confusion_matrix(y_test, y_pred)

# Formatting the labels with a space halfway through multi-word labels
def format_label(label):
    label = label.replace("uM", " ΞΌM").replace("ZnCl2", "ZnCl$_2$")
    label = label.replace("SiNP", "SiNPs")
    words = label.split()
    if len(words) > 1:
        mid_point = 3
        return ' '.join(words[:mid_point]) + '\n' + ' '.join(words[mid_point:])
    return label

formatted_labels = [format_label(label) for label in lb.classes_]

ax = sns.heatmap(cm, annot=True, fmt="d", cmap="Blues", 
                 xticklabels=formatted_labels, 
                 yticklabels=formatted_labels)
colorbar = ax.collections[0].colorbar
colorbar.set_label('Number of Test SiNPs')
max_val = np.max(cm)
colorbar.set_ticks(np.arange(0, max_val + 1, 1))
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.title("Confusion Matrix")
plt.xticks(rotation=0)
plt.yticks(rotation=90)
plt.show()

# Outputting the classification report and ranked feature importance as a dataframe
report_dict = classification_report(y_test, y_pred, output_dict=True)
report_df = pd.DataFrame(report_dict).transpose()

# Calculating and ranking feature importance
final_feature_importances = pd.Series(best_clf.feature_importances_, index=best_features).sort_values(ascending=False)
feature_importance_df = pd.DataFrame({'Feature': final_feature_importances.index, 'Importance': final_feature_importances.values})

# Replacing "uM" with " ΞΌM" in feature names
feature_importance_df['Feature'] = feature_importance_df['Feature'].str.replace("uM", " ΞΌM").str.replace("ZnCl2", "ZnCl$_2$")

# Plotting feature importance
plt.figure(figsize=(12, 6), dpi=300)
final_feature_importances.plot(kind='barh', color='royalblue')
plt.xlabel("Feature Importance Score")
plt.ylabel("Feature")
plt.title("Feature Importance")
plt.show()

# Plotting ROC Curves as 3D plots
fig = plt.figure(figsize=(12, 8), dpi=300)
ax = fig.add_subplot(111, projection='3d')

# Colours for ROC curves
colors = ['blue', 'turquoise', 'purple', 'orange']  # Extend if more classes exist

for i, particle_label in enumerate(lb.classes_):
    formatted_label = particle_label.replace("uM", " ΞΌM").replace("ZnCl2", "ZnCl$_2$").replace("SiNP", "SiNPs")
    ax.plot(fpr[i], tpr[i], zs=i, zdir='y', color=colors[i % len(colors)], label=f"{formatted_label} (AUC = {roc_auc[i]:.2f})")

# Adding diagonal dashed lines for chance score
chance_x = np.linspace(0, 1, 100)
chance_y = np.linspace(0, 1, 100)
for i in range(len(lb.classes_)):
    ax.plot(chance_x, chance_y, zs=i, zdir='y', linestyle='dashed', color='black', label='Chance' if i == 0 else "")

ax.set_xlabel("False Positive Rate")
ax.set_zlabel("True Positive Rate")
ax.set_title("ROC Curves for Each Particle Type")
ax.legend(loc="upper left")
ax.set_yticks([])
ax.set_box_aspect(None, zoom=0.9)
plt.show()

# Printing the best model results
print("Best Model Results")
print("===================")
print(f"Accuracy: {best_accuracy:.2f}")
print(f"Best Hyperparameters: {best_params}\n")
print(f"Best Features: {', '.join(best_features)}")

# Printing the detailed model results
print("Detailed Model Results\n")
print("=======================\n")
print(f"Accuracy: {best_accuracy:.2f}\n\n")
print("Classification Report:\n")
print(classification_rep)
print("\n\nFeature Importances:\n")
print(final_feature_importances.sort_values(ascending=False).to_string())
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
Best Model Results
===================
Accuracy: 0.71
Best Hyperparameters: {'max_depth': None, 'min_samples_leaf': 1, 'min_samples_split': 5, 'n_estimators': 50}

Best Features: f3 A Parameter, f2 A Parameter, f2 Peak Frequency (Hz), f1 A Parameter, f3 Peak Frequency (Hz)
Detailed Model Results

=======================

Accuracy: 0.71


Classification Report:

                       precision    recall  f1-score   support

25T 1000uM ZnCl2 SiNP       1.00      0.80      0.89         5
 25T 100uM ZnCl2 SiNP       0.50      1.00      0.67         2
 25T 500uM ZnCl2 SiNP       0.50      0.67      0.57         3
 25T 750uM ZnCl2 SiNP       1.00      0.50      0.67         4

             accuracy                           0.71        14
            macro avg       0.75      0.74      0.70        14
         weighted avg       0.82      0.71      0.73        14



Feature Importances:

f2 A Parameter            0.230730
f3 A Parameter            0.229042
f3 Peak Frequency (Hz)    0.219655
f1 A Parameter            0.174970
f2 Peak Frequency (Hz)    0.145603

Repeating the data analysis with outliers includedΒΆ

InΒ [22]:
# Creating the dataset with only 25A and 25T at 1000uM concentration and SiNP
DNA_SiNPs_data_with_outliers = s2n_data.loc[s2n_data['Particle Type'].isin([
    '25A 1000uM ZnCl2 SiNP',
    '25T 1000uM ZnCl2 SiNP',
    'SiNP'
])]

# Saving the dataset to a .csv file
DNA_SiNPs_data_with_outliers.to_csv("DNA_SiNPs_data_with_outliers.csv", index=False)

# Creating the dataset with all of the 25T data at different ZnCl2 concentrations
T25_data_with_outliers = s2n_data.loc[s2n_data['Particle Type'].isin([
    '25T 100uM ZnCl2 SiNP',
    '25T 500uM ZnCl2 SiNP',
    '25T 750uM ZnCl2 SiNP',
    '25T 1000uM ZnCl2 SiNP'
])]

# Saving the dataset to a .csv file
T25_data_with_outliers.to_csv("T25_data_with_outliers.csv", index=False)

Linear Discriminant Analysis (LDA) Plots with OutliersΒΆ

InΒ [23]:
# Encoding particle type as a numeric label for comparison
DNA_SiNPs_data_with_outliers['Particle Type Label'], type_labels = pd.factorize(DNA_SiNPs_data_with_outliers['Particle Type'])

# Dropping particle type for clustering
X = DNA_SiNPs_data_with_outliers.drop(columns=["Particle Type", "Particle Type Label"])

# Standardising the data
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Applying LDA for dimensionality reduction
lda = LDA(n_components=2)
X_reduced = lda.fit_transform(X_scaled, DNA_SiNPs_data_with_outliers['Particle Type Label'])

# Plotting a single graph
fig, ax = plt.subplots(figsize=(8, 6), dpi=300)

# Plotting the true particles as a scatter plot
for particle_type, label in enumerate(type_labels):
    display_label = "Standard SiNP" if label == "SiNP" else label
    ax.scatter(
        X_reduced[DNA_SiNPs_data_with_outliers['Particle Type Label'] == particle_type, 0],
        X_reduced[DNA_SiNPs_data_with_outliers['Particle Type Label'] == particle_type, 1],
        label=display_label,
        alpha=0.7,
        color=particle_colours[label],
        edgecolors='black'
    )

# Creating handles for the particle type legend using circles for particles
particle_handles = [plt.Line2D([0], [0], marker='o', color='w',
                               markerfacecolor=particle_colours[label], markersize=10,
                               label=("Standard SiNP" if label == "SiNP" else label)
                                     .replace("ZnCl2", "ZnCl$_2$")
                                     .replace("uM", " ΞΌM")) for label in type_labels]
# Adding particle type legend
ax.legend(handles=particle_handles, title="Particle Type", loc="lower left")

# Axes labels
ax.set_xlabel("LDA Component 1")
ax.set_ylabel("LDA Component 2")

# Displaying the plot
plt.grid(False)
plt.tight_layout()
plt.show()
C:\Users\tjwil\AppData\Local\Temp\ipykernel_25096\2430924205.py:2: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  DNA_SiNPs_data_with_outliers['Particle Type Label'], type_labels = pd.factorize(DNA_SiNPs_data_with_outliers['Particle Type'])
No description has been provided for this image
InΒ [24]:
# Encoding particle type as a numeric label for comparison
T25_data_with_outliers['Particle Type Label'], type_labels = pd.factorize(T25_data_with_outliers['Particle Type'])

# Dropping particle type for clustering
X = T25_data_with_outliers.drop(columns=["Particle Type", "Particle Type Label"])

# Standardising the data
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Applying LDA for dimensionality reduction
lda = LDA(n_components=2)
X_reduced = lda.fit_transform(X_scaled, T25_data_with_outliers['Particle Type Label'])

# Plotting a single graph
fig, ax = plt.subplots(figsize=(8, 6), dpi=300)  # Single plot

# Plotting the true particles as a scatter plot
for particle_type, label in enumerate(type_labels):
    ax.scatter(
        X_reduced[T25_data_with_outliers['Particle Type Label'] == particle_type, 0],
        X_reduced[T25_data_with_outliers['Particle Type Label'] == particle_type, 1],
        label=f"{label}",
        alpha=0.7,
        color=particle_colours[label],
        edgecolors='black'
    )

# Creating handles for the particle type legend using circles for particles
particle_handles = [plt.Line2D([0], [0], marker='o', color='w', markerfacecolor=particle_colours[label], markersize=10, label=label.replace("ZnCl2", "ZnCl$_2$").replace("uM", " ΞΌM")) for label in type_labels]

# Adding particle type legend
ax.legend(handles=particle_handles, title="Particle Type", loc="lower left")

# Axes labels
ax.set_xlabel("LDA Component 1")
ax.set_ylabel("LDA Component 2")

# Displaying the plot
plt.grid(False)
plt.tight_layout()
plt.show()
C:\Users\tjwil\AppData\Local\Temp\ipykernel_25096\3256748307.py:2: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  T25_data_with_outliers['Particle Type Label'], type_labels = pd.factorize(T25_data_with_outliers['Particle Type'])
No description has been provided for this image

Random forest analysis and confusion matrix and ROC curves with outliersΒΆ

DNA_SiNPs_dataΒΆ

InΒ [25]:
# Reading the DNA_SiNPs_data.csv
DNA_SiNPs_data_with_outliers = pd.read_csv("DNA_SiNPs_data_with_outliers.csv")

# Preprocessing the data by features and target variable
X = DNA_SiNPs_data_with_outliers.drop('Particle Type', axis=1)
y = DNA_SiNPs_data_with_outliers['Particle Type']

# Splitting the data into 80% training and 20% testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# The parameter grid to iterate through to find the best random forest model
param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [None, 10, 20, 30],
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 4]
}

# Initial random forest training and feature importance evaluation
clf = RandomForestClassifier(random_state=42)
clf.fit(X_train, y_train)

# Getting feature importance scores
feature_importances = pd.Series(clf.feature_importances_, index=X.columns).sort_values(ascending=False)
top_features = feature_importances.index.tolist()

# Selecting top features and tuning hyperparameters
best_accuracy = 0
best_params = None
best_features = None

# Initialising the progress bar
total_iterations = len(top_features)
with tqdm(total=total_iterations, desc="Feature Subset Iterations", unit="features") as pbar:
    for num_features in range(1, len(top_features) + 1):
        selected_features = top_features[:num_features]
        X_train_subset = X_train[selected_features]
        X_test_subset = X_test[selected_features]

        # Performing GridSearchCV for hyperparameter tuning
        grid_search = GridSearchCV(estimator=RandomForestClassifier(random_state=42), param_grid=param_grid, cv=3, scoring='accuracy', n_jobs=-1)
        grid_search.fit(X_train_subset, y_train)

        # Evaluating the test set
        y_pred = grid_search.best_estimator_.predict(X_test_subset)
        accuracy = accuracy_score(y_test, y_pred)

        # Checking if this is the best model based upon accuracy score
        if accuracy > best_accuracy:
            best_accuracy = accuracy
            best_params = grid_search.best_params_
            best_features = selected_features

        # Updating the progress bar
        pbar.update(1)

# Training the best model and generate additional outputs
best_clf = RandomForestClassifier(**best_params, random_state=42)
best_clf.fit(X_train[best_features], y_train)
y_pred = best_clf.predict(X_test[best_features])
classification_rep = classification_report(y_test, y_pred)

# Saving feature importances and classification report
final_feature_importances = pd.Series(best_clf.feature_importances_, index=best_features)

# Binarizing the target labels for multiclass ROC curve
lb = LabelBinarizer()
y_train_bin = lb.fit_transform(y_train)
y_test_bin = lb.transform(y_test)

# Getting predicted probabilities for each class
y_prob = best_clf.predict_proba(X_test[best_features])

# Generating ROC variables and AUC for each class
fpr = {}
tpr = {}
roc_auc = {}

# Iterating through each class and plot the ROC curve
for i, particle_label in enumerate(lb.classes_):
    fpr[i], tpr[i], _ = roc_curve(y_test_bin[:, i], y_prob[:, i])
    roc_auc[i] = auc(fpr[i], tpr[i])
Feature Subset Iterations: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 26/26 [06:05<00:00, 14.05s/features]
InΒ [26]:
# Generating the confusion matrix
plt.figure(figsize=(6, 5), dpi=300)
cm = confusion_matrix(y_test, y_pred)

# Formatting the labels with a space halfway through multi-word labels
def format_label(label):
    if label == "SiNP":
        label = "Standard SiNP"
    label = label.replace("SiNP", "SiNPs")
    label = label.replace("uM", " ΞΌM").replace("ZnCl2", "ZnCl$_2$")
    words = label.split()
    if len(words) > 1:
        mid_point = 3
        return ' '.join(words[:mid_point]) + '\n' + ' '.join(words[mid_point:])
    return label

formatted_labels = [format_label(label) for label in lb.classes_]

ax = sns.heatmap(cm, annot=True, fmt="d", cmap="Blues", 
                 xticklabels=formatted_labels, 
                 yticklabels=formatted_labels)
colorbar = ax.collections[0].colorbar
colorbar.set_label('Number of Test SiNPs')
max_val = np.max(cm)
colorbar.set_ticks(np.arange(0, max_val + 1, 1))
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.title("Confusion Matrix")
plt.xticks(rotation=0)
plt.yticks(rotation=90)
plt.show()

# Outputting the classification report and ranked feature importance as a dataframe
report_dict = classification_report(y_test, y_pred, output_dict=True)
report_df = pd.DataFrame(report_dict).transpose()

# Calculating and ranking feature importance
final_feature_importances = pd.Series(best_clf.feature_importances_, index=best_features).sort_values(ascending=False)
feature_importance_df = pd.DataFrame({'Feature': final_feature_importances.index, 'Importance': final_feature_importances.values})

# Replacing "uM" with " ΞΌM" in feature names
feature_importance_df['Feature'] = feature_importance_df['Feature'].str.replace("uM", " ΞΌM").str.replace("ZnCl2", "ZnCl$_2$")

# Plotting feature importance
plt.figure(figsize=(12, 6), dpi=300)
final_feature_importances.plot(kind='barh', color='royalblue')
plt.xlabel("Feature Importance Score")
plt.ylabel("Feature")
plt.title("Feature Importance")
plt.show()

def format_ROC_label(label):
    if label == "SiNP":
        label = "Standard SiNP"
    label = label.replace("SiNP", "SiNPs")
    label = label.replace("uM", " ΞΌM").replace("ZnCl2", "ZnCl$_2$")
    return label

# Plotting ROC Curves as 3D plots
fig = plt.figure(figsize=(12, 8), dpi=300)
ax = fig.add_subplot(111, projection='3d')

# Colours for ROC curves
colors = ['red', 'blue', 'gray']

for i, particle_label in enumerate(lb.classes_):
    formatted_label = format_label(particle_label)
    ax.plot(fpr[i], tpr[i], zs=i, zdir='y', color=colors[i % len(colors)], label=f"{formatted_label} (AUC = {roc_auc[i]:.2f})")

# Adding diagonal dashed lines for chance score
chance_x = np.linspace(0, 1, 100)
chance_y = np.linspace(0, 1, 100)
for i in range(len(lb.classes_)):
    ax.plot(chance_x, chance_y, zs=i, zdir='y', linestyle='dashed', color='black', label='Chance' if i == 0 else "")

ax.set_xlabel("False Positive Rate")
ax.set_zlabel("True Positive Rate")
ax.set_title("ROC Curves for Each Particle Type")
ax.legend(loc="upper left")
ax.set_yticks([])
ax.set_box_aspect(None, zoom=0.9)
plt.show()

# Printing the best model results
print("Best Model Results")
print("===================")
print(f"Accuracy: {best_accuracy:.2f}")
print(f"Best Hyperparameters: {best_params}\n")
print(f"Best Features: {', '.join(best_features)}")

# Printing the detailed model results
print("Detailed Model Results\n")
print("=======================\n")
print(f"Accuracy: {best_accuracy:.2f}\n\n")
print("Classification Report:\n")
print(classification_rep)
print("\n\nFeature Importances:\n")
print(final_feature_importances.sort_values(ascending=False).to_string())
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
Best Model Results
===================
Accuracy: 0.68
Best Hyperparameters: {'max_depth': None, 'min_samples_leaf': 1, 'min_samples_split': 5, 'n_estimators': 200}

Best Features: f1 Peak Frequency (Hz), f1 A Parameter, f1 Conversion Factor (V m^-1)
Detailed Model Results

=======================

Accuracy: 0.68


Classification Report:

                       precision    recall  f1-score   support

25A 1000uM ZnCl2 SiNP       0.20      0.25      0.22         4
25T 1000uM ZnCl2 SiNP       0.88      0.78      0.82         9
                 SiNP       0.78      0.78      0.78         9

             accuracy                           0.68        22
            macro avg       0.62      0.60      0.61        22
         weighted avg       0.71      0.68      0.70        22



Feature Importances:

f1 Peak Frequency (Hz)           0.364417
f1 A Parameter                   0.338019
f1 Conversion Factor (V m^-1)    0.297565

T25_dataΒΆ

InΒ [27]:
# Reading the T25_data.csv
T25_data_with_outliers = pd.read_csv("T25_data_with_outliers.csv")

# Preprocessing the data by features and target variable
X = T25_data_with_outliers.drop('Particle Type', axis=1)
y = T25_data_with_outliers['Particle Type']

# Splitting the data into 80% training and 20% testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# The parameter grid to iterate through to find the best random forest model
param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [None, 10, 20, 30],
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 4]
}

# Initial random forest training and feature importance evaluation
clf = RandomForestClassifier(random_state=42)
clf.fit(X_train, y_train)

# Getting feature importance scores
feature_importances = pd.Series(clf.feature_importances_, index=X.columns).sort_values(ascending=False)
top_features = feature_importances.index.tolist()

# Selecting top features and tuning hyperparameters
best_accuracy = 0
best_params = None
best_features = None

# Initialising the progress bar
total_iterations = len(top_features)
with tqdm(total=total_iterations, desc="Feature Subset Iterations", unit="features") as pbar:
    for num_features in range(1, len(top_features) + 1):
        selected_features = top_features[:num_features]
        X_train_subset = X_train[selected_features]
        X_test_subset = X_test[selected_features]

        # Performing GridSearchCV for hyperparameter tuning
        grid_search = GridSearchCV(estimator=RandomForestClassifier(random_state=42), param_grid=param_grid, cv=3, scoring='accuracy', n_jobs=-1)
        grid_search.fit(X_train_subset, y_train)

        # Evaluating the test set
        y_pred = grid_search.best_estimator_.predict(X_test_subset)
        accuracy = accuracy_score(y_test, y_pred)

        # Checking if this is the best model based upon accuracy score
        if accuracy > best_accuracy:
            best_accuracy = accuracy
            best_params = grid_search.best_params_
            best_features = selected_features

        # Updating the progress bar
        pbar.update(1)

# Training the best model and generate additional outputs
best_clf = RandomForestClassifier(**best_params, random_state=42)
best_clf.fit(X_train[best_features], y_train)
y_pred = best_clf.predict(X_test[best_features])
classification_rep = classification_report(y_test, y_pred)

# Saving feature importances and classification report
final_feature_importances = pd.Series(best_clf.feature_importances_, index=best_features)

# Binarizing the target labels for multiclass ROC curve
lb = LabelBinarizer()
y_train_bin = lb.fit_transform(y_train)
y_test_bin = lb.transform(y_test)

# Getting predicted probabilities for each class
y_prob = best_clf.predict_proba(X_test[best_features])

# Generating ROC variables and AUC for each class
fpr = {}
tpr = {}
roc_auc = {}

# Iterating through each class and plot the ROC curve
for i, particle_label in enumerate(lb.classes_):
    fpr[i], tpr[i], _ = roc_curve(y_test_bin[:, i], y_prob[:, i])
    roc_auc[i] = auc(fpr[i], tpr[i])
Feature Subset Iterations: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 26/26 [08:04<00:00, 18.64s/features]
InΒ [28]:
# Generating the confusion matrix
plt.figure(figsize=(6, 5), dpi=300)
cm = confusion_matrix(y_test, y_pred)

# Formatting the labels with a space halfway through multi-word labels
def format_label(label):
    label = label.replace("uM", " ΞΌM").replace("ZnCl2", "ZnCl$_2$").replace("SiNP", "SiNPs")
    words = label.split()
    if len(words) > 1:
        mid_point = 3
        return ' '.join(words[:mid_point]) + '\n' + ' '.join(words[mid_point:])
    return label

formatted_labels = [format_label(label) for label in lb.classes_]

ax = sns.heatmap(cm, annot=True, fmt="d", cmap="Blues", 
                 xticklabels=formatted_labels, 
                 yticklabels=formatted_labels)
colorbar = ax.collections[0].colorbar
colorbar.set_label('Number of Test SiNPs')
max_val = np.max(cm)
colorbar.set_ticks(np.arange(0, max_val + 1, 1))
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.title("Confusion Matrix")
plt.xticks(rotation=0)
plt.yticks(rotation=90)
plt.show()

# Outputting the classification report and ranked feature importance as a dataframe
report_dict = classification_report(y_test, y_pred, output_dict=True)
report_df = pd.DataFrame(report_dict).transpose()

# Calculating and ranking feature importance
final_feature_importances = pd.Series(best_clf.feature_importances_, index=best_features).sort_values(ascending=False)
feature_importance_df = pd.DataFrame({'Feature': final_feature_importances.index, 'Importance': final_feature_importances.values})

# Replacing "uM" with " ΞΌM" in feature names
feature_importance_df['Feature'] = feature_importance_df['Feature'].str.replace("uM", " ΞΌM").str.replace("ZnCl2", "ZnCl$_2$")

# Plotting feature importance
plt.figure(figsize=(12, 6), dpi=300)
final_feature_importances.plot(kind='barh', color='royalblue')
plt.xlabel("Feature Importance Score")
plt.ylabel("Feature")
plt.title("Feature Importance")
plt.show()

# Plotting ROC Curves as 3D plots
fig = plt.figure(figsize=(12, 8), dpi=300)
ax = fig.add_subplot(111, projection='3d')

# Colours for ROC curves
colors = ['blue', 'turquoise', 'purple', 'orange']  # Extend if more classes exist

for i, particle_label in enumerate(lb.classes_):
    formatted_label = particle_label.replace("uM", " ΞΌM").replace("ZnCl2", "ZnCl$_2$").replace("SiNP", "SiNPs")
    ax.plot(fpr[i], tpr[i], zs=i, zdir='y', color=colors[i % len(colors)], label=f"{formatted_label} (AUC = {roc_auc[i]:.2f})")

# Adding diagonal dashed lines for chance score
chance_x = np.linspace(0, 1, 100)
chance_y = np.linspace(0, 1, 100)
for i in range(len(lb.classes_)):
    ax.plot(chance_x, chance_y, zs=i, zdir='y', linestyle='dashed', color='black', label='Chance' if i == 0 else "")

ax.set_xlabel("False Positive Rate")
ax.set_zlabel("True Positive Rate")
ax.set_title("ROC Curves for Each Particle Type")
ax.legend(loc="upper left")
ax.set_yticks([])
ax.set_box_aspect(None, zoom=0.9)
plt.show()

# Printing the best model results
print("Best Model Results")
print("===================")
print(f"Accuracy: {best_accuracy:.2f}")
print(f"Best Hyperparameters: {best_params}\n")
print(f"Best Features: {', '.join(best_features)}")

# Printing the detailed model results
print("Detailed Model Results\n")
print("=======================\n")
print(f"Accuracy: {best_accuracy:.2f}\n\n")
print("Classification Report:\n")
print(classification_rep)
print("\n\nFeature Importances:\n")
print(final_feature_importances.sort_values(ascending=False).to_string())
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
Best Model Results
===================
Accuracy: 0.55
Best Hyperparameters: {'max_depth': None, 'min_samples_leaf': 2, 'min_samples_split': 5, 'n_estimators': 200}

Best Features: f3 A Parameter, f2 A Parameter, f1 A Parameter, f2 Signal-to-noise Value, f1 Peak Frequency (Hz), f3 Signal-to-noise Value, f2 Peak Frequency (Hz), f1 Signal-to-noise Value, f3 Conversion Factor (V m^-1), f1 Conversion Factor (V m^-1), f3 Damping, f3 Peak Frequency (Hz)
Detailed Model Results

=======================

Accuracy: 0.55


Classification Report:

                       precision    recall  f1-score   support

25T 1000uM ZnCl2 SiNP       0.83      0.83      0.83         6
 25T 100uM ZnCl2 SiNP       0.67      0.50      0.57         4
 25T 500uM ZnCl2 SiNP       0.67      0.44      0.53         9
 25T 750uM ZnCl2 SiNP       0.14      0.33      0.20         3

             accuracy                           0.55        22
            macro avg       0.58      0.53      0.53        22
         weighted avg       0.64      0.55      0.58        22



Feature Importances:

f3 A Parameter                   0.159886
f2 A Parameter                   0.131673
f1 A Parameter                   0.102869
f1 Peak Frequency (Hz)           0.080818
f2 Peak Frequency (Hz)           0.076410
f1 Signal-to-noise Value         0.070008
f3 Peak Frequency (Hz)           0.069633
f1 Conversion Factor (V m^-1)    0.067318
f2 Signal-to-noise Value         0.065018
f3 Damping                       0.063214
f3 Conversion Factor (V m^-1)    0.057901
f3 Signal-to-noise Value         0.055252

Frequency comparison chartsΒΆ

Note:

  • The datasets here are provided in the repository as a compressed .7z file. Please extract these into a folder named compare_peaks_data.
InΒ [29]:
# Loading the Rhode and Schwarz .bin oscilloscope data file
def get_bin_data(file):
    y, time_data, save_info = RTxReadBin(file)
    voltages = y[:, 0, 0]
    time_start = save_info["XStart"]
    time_stop = save_info["XStop"]
    signal_record_length = save_info["SignalRecordLength"]
    time_step = (time_stop - time_start) / signal_record_length
    sample_frequency = 1 / time_step
    data = oa.load_voltage_data(voltages, SampleFreq = sample_frequency, timeStart = time_start)
    return data
InΒ [30]:
# Loading the sample compare_peaks.csv table
compare_peaks_table = pd.read_csv("compare_peaks.csv")
compare_peaks_table.head()
Out[30]:
SiNP f1 SiNP Peak Freq (Hz) f2 SiNP Peak Freq (Hz) f3 SiNP Peak Freq (Hz) 1000uM ZnCl2 f1 1000uM ZnCl2 Peak Freq (Hz) f2 1000uM ZnCl2 Peak Freq (Hz) f3 1000uM ZnCl2 Peak Freq (Hz) 25A 1000uM ZnCl2 f1 25A 1000uM ZnCl2 Peak Freq (Hz) ... f2 25T 100uM ZnCl2 Peak Freq (Hz) f3 25T 100uM ZnCl2 Peak Freq (Hz) 25T 500uM ZnCl2 f1 25T 500uM ZnCl2 Peak Freq (Hz) f2 25T 500uM ZnCl2 Peak Freq (Hz) f3 25T 500uM ZnCl2 Peak Freq (Hz) 25T 750uM ZnCl2 f1 25T 750uM ZnCl2 Peak Freq (Hz) f2 25T 750uM ZnCl2 Peak Freq (Hz) f3 25T 750uM ZnCl2 Peak Freq (Hz)
0 compare_peaks_data/RefCurve_2025-01-07_1.Wfm.bin 134922 211863 239476 compare_peaks_data/RefCurve_2025-01-08_3.Wfm.bin 138675 214200 240542 compare_peaks_data/RefCurve_2025-01-09_3.Wfm.bin 135659 ... 196562 216526 compare_peaks_data/RefCurve_2025-01-15_3.Wfm.bin 138160 213057 239462 compare_peaks_data/RefCurve_2025-01-17_2.Wfm.bin 134625 206953 232033
1 compare_peaks_data/RefCurve_2025-01-07_2.Wfm.bin 139785 213320 239286 compare_peaks_data/RefCurve_2025-01-08_4.Wfm.bin 134765 214006 242753 compare_peaks_data/RefCurve_2025-01-09_4.Wfm.bin 137007 ... 198129 217631 compare_peaks_data/RefCurve_2025-01-15_5.Wfm.bin 138036 210859 236170 compare_peaks_data/RefCurve_2025-01-17_1.Wfm.bin 132480 204808 228050
2 compare_peaks_data/RefCurve_2025-01-07_6.Wfm.bin 136054 209189 234708 compare_peaks_data/RefCurve_2025-01-08_6.Wfm.bin 137234 215233 242930 compare_peaks_data/RefCurve_2025-01-09_9.Wfm.bin 135581 ... 199125 218956 compare_peaks_data/RefCurve_2025-01-15_9.Wfm.bin 136442 208835 234303 compare_peaks_data/RefCurve_2025-01-17_5.Wfm.bin 133612 207739 233430
3 compare_peaks_data/RefCurve_2025-01-07_8.Wfm.bin 130264 199700 220940 compare_peaks_data/RefCurve_2025-01-08_11.Wfm.bin 135633 212475 239964 compare_peaks_data/RefCurve_2025-01-09_14.Wfm.bin 136459 ... 199098 219210 compare_peaks_data/RefCurve_2025-01-15_10.Wfm.bin 137352 208791 233440 compare_peaks_data/RefCurve_2025-01-17_8.Wfm.bin 129774 201465 225190
4 compare_peaks_data/RefCurve_2025-01-07_11.Wfm.bin 137230 203313 224191 compare_peaks_data/RefCurve_2025-01-08_14.Wfm.bin 138198 216454 244179 compare_peaks_data/RefCurve_2025-01-09_16.Wfm.bin 135945 ... 202947 226619 compare_peaks_data/RefCurve_2025-01-15_14.Wfm.bin 133368 193811 211430 compare_peaks_data/RefCurve_2025-01-17_13.Wfm.bin 136882 198773 217662

5 rows Γ— 28 columns

InΒ [31]:
# Getting the limits for the f1 frequency values with a 2000 Hz buffer on either side
def get_f1_limits(row):
    columns =["f1 SiNP Peak Freq (Hz)", "f1 1000uM ZnCl2 Peak Freq (Hz)",
              "f1 25A 1000uM ZnCl2 Peak Freq (Hz)", "f1 25T 1000uM ZnCl2 Peak Freq (Hz)",
              "f1 25T 100uM ZnCl2 Peak Freq (Hz)", "f1 25T 500uM ZnCl2 Peak Freq (Hz)",
              "f1 25T 750uM ZnCl2 Peak Freq (Hz)"]
    min_value = row[columns].min() - 2000
    max_value = row[columns].max() + 2000
    return min_value, max_value

# Getting the limits for the f2 frequency values with a 2000 Hz buffer on either side
def get_f2_limits(row):
    columns =["f2 SiNP Peak Freq (Hz)", "f2 1000uM ZnCl2 Peak Freq (Hz)",
              "f2 25A 1000uM ZnCl2 Peak Freq (Hz)", "f2 25T 1000uM ZnCl2 Peak Freq (Hz)",
              "f2 25T 100uM ZnCl2 Peak Freq (Hz)", "f2 25T 500uM ZnCl2 Peak Freq (Hz)",
              "f2 25T 750uM ZnCl2 Peak Freq (Hz)"]
    min_value = row[columns].min() - 2000
    max_value = row[columns].max() + 2000
    return min_value, max_value

# Getting the limits for the f3 frequency values with a 2000 Hz buffer on either side
def get_f3_limits(row):
    columns =["f3 SiNP Peak Freq (Hz)", "f3 1000uM ZnCl2 Peak Freq (Hz)",
              "f3 25A 1000uM ZnCl2 Peak Freq (Hz)", "f3 25T 1000uM ZnCl2 Peak Freq (Hz)",
              "f3 25T 100uM ZnCl2 Peak Freq (Hz)", "f3 25T 500uM ZnCl2 Peak Freq (Hz)",
              "f3 25T 750uM ZnCl2 Peak Freq (Hz)"]
    min_value = row[columns].min() - 2000
    max_value = row[columns].max() + 2000
    return min_value, max_value
InΒ [32]:
# Plotting the data with all particle types included
def plot_all_peaks(SiNP_data, ZnCl2_data, A1000_data, T1000_data, 
                     T100_data, T500_data, T750_data, f1_min_value, f1_max_value,
                     f2_min_value, f2_max_value, f3_min_value, f3_max_value):

    # Getting the frequency and PSD for each particle type
    SiNP_freq, SiNP_psd = SiNP_data.get_PSD()
    ZnCl2_freq, ZnCl2_psd = ZnCl2_data.get_PSD()
    A1000_freq, A1000_psd = A1000_data.get_PSD()
    T1000_freq, T1000_psd = T1000_data.get_PSD()
    T100_freq, T100_psd = T100_data.get_PSD()
    T500_freq, T500_psd = T500_data.get_PSD()
    T750_freq, T750_psd = T750_data.get_PSD()

    # Setting up the plot
    fig, axes = plt.subplots(1, 3, figsize=(15, 5), dpi=300)

    # Plotting the f1 peak in the first window
    axes[0].plot(SiNP_freq, SiNP_psd, color="gray", label="Standard SiNP")
    axes[0].plot(ZnCl2_freq, ZnCl2_psd, color="green", label="1000 ΞΌM ZnCl$_2$ SiNP")
    axes[0].plot(A1000_freq, A1000_psd, color="red", label="25A 1000 ΞΌM ZnCl$_2$ SiNP")
    axes[0].plot(T1000_freq, T1000_psd, color="blue", label="25T 1000 ΞΌM ZnCl$_2$ SiNP")
    axes[0].plot(T100_freq, T100_psd, color="turquoise", label="25T 100 ΞΌM ZnCl$_2$ SiNP")
    axes[0].plot(T500_freq, T500_psd, color="purple", label="25T 500 ΞΌM ZnCl$_2$ SiNP")
    axes[0].plot(T750_freq, T750_psd, color="orange", label="25T 750 ΞΌM ZnCl$_2$ SiNP")    
    axes[0].set_xlim(f1_min_value, f1_max_value)
    axes[0].set_yscale("log")
    axes[0].set_xlabel("Frequency (Hz)")
    axes[0].set_ylabel("$S_{xx}$ ($V^2/Hz$)")
    axes[0].set_title("Plot of all f1 Peaks")
    axes[0].grid(True)
    axes[0].legend(loc="lower right")

    # Plotting the f2 peak in the second window
    axes[1].plot(SiNP_freq, SiNP_psd, color="gray", label="Standard SiNP")
    axes[1].plot(ZnCl2_freq, ZnCl2_psd, color="green", label="1000 ΞΌM ZnCl$_2$ SiNP")
    axes[1].plot(A1000_freq, A1000_psd, color="red", label="25A 1000 ΞΌM ZnCl$_2$ SiNP")
    axes[1].plot(T1000_freq, T1000_psd, color="blue", label="25T 1000 ΞΌM ZnCl$_2$ SiNP")
    axes[1].plot(T100_freq, T100_psd, color="turquoise", label="25T 100 ΞΌM ZnCl$_2$ SiNP")
    axes[1].plot(T500_freq, T500_psd, color="purple", label="25T 500 ΞΌM ZnCl$_2$ SiNP")
    axes[1].plot(T750_freq, T750_psd, color="orange", label="25T 750 ΞΌM ZnCl$_2$ SiNP")    
    axes[1].set_xlim(f2_min_value, f2_max_value)
    axes[1].set_yscale("log")
    axes[1].set_xlabel("Frequency (Hz)")
    axes[1].set_ylabel("$S_{xx}$ ($V^2/Hz$)")
    axes[1].set_title("Plot of all f2 Peaks")
    axes[1].grid(True)
    axes[1].legend(loc="lower right")

    # Plotting the f3 peak in the third window
    axes[2].plot(SiNP_freq, SiNP_psd, color="gray", label="Standard SiNP")
    axes[2].plot(ZnCl2_freq, ZnCl2_psd, color="green", label="1000 ΞΌM ZnCl$_2$ SiNP")
    axes[2].plot(A1000_freq, A1000_psd, color="red", label="25A 1000 ΞΌM ZnCl$_2$ SiNP")
    axes[2].plot(T1000_freq, T1000_psd, color="blue", label="25T 1000 ΞΌM ZnCl$_2$ SiNP")
    axes[2].plot(T100_freq, T100_psd, color="turquoise", label="25T 100 ΞΌM ZnCl$_2$ SiNP")
    axes[2].plot(T500_freq, T500_psd, color="purple", label="25T 500 ΞΌM ZnCl$_2$ SiNP")
    axes[2].plot(T750_freq, T750_psd, color="orange", label="25T 750 ΞΌM ZnCl$_2$ SiNP")    
    axes[2].set_xlim(f3_min_value, f3_max_value)
    axes[2].set_yscale("log")
    axes[2].set_xlabel("Frequency (Hz)")
    axes[2].set_ylabel("$S_{xx}$ ($V^2/Hz$)")
    axes[2].set_title("Plot of all f3 Peaks")
    axes[2].grid(True)
    axes[2].legend(loc="lower right")

    # Displaying the plot
    plt.tight_layout()
    plt.show()

    return
InΒ [33]:
# Plotting the data with only 25T particle types included
def plot_T_peaks(T1000_data, T100_data, T500_data, T750_data, f1_min_value, f1_max_value,
                     f2_min_value, f2_max_value, f3_min_value, f3_max_value):

    # Getting the frequency and PSD for each particle type
    T1000_freq, T1000_psd = T1000_data.get_PSD()
    T100_freq, T100_psd = T100_data.get_PSD()
    T500_freq, T500_psd = T500_data.get_PSD()
    T750_freq, T750_psd = T750_data.get_PSD()

    # Setting up the plot
    fig, axes = plt.subplots(1, 3, figsize=(15, 5), dpi=300)

    # Plotting the f1 peak in the first window
    axes[0].plot(T1000_freq, T1000_psd, color="blue", label="25T 1000 ΞΌM ZnCl$_2$ SiNP")
    axes[0].plot(T100_freq, T100_psd, color="turquoise", label="25T 100 ΞΌM ZnCl$_2$ SiNP")
    axes[0].plot(T500_freq, T500_psd, color="purple", label="25T 500 ΞΌM ZnCl$_2$ SiNP")
    axes[0].plot(T750_freq, T750_psd, color="orange", label="25T 750 ΞΌM ZnCl$_2$ SiNP")    
    axes[0].set_xlim(f1_min_value, f1_max_value)
    axes[0].set_yscale("log")
    axes[0].set_xlabel("Frequency (Hz)")
    axes[0].set_ylabel("$S_{xx}$ ($V^2/Hz$)")
    axes[0].set_title("Plot of 25T f1 Peaks")
    axes[0].grid(True)
    axes[0].legend(loc="lower right")

    # Plotting the f2 peak in the second window
    axes[1].plot(T1000_freq, T1000_psd, color="blue", label="25T 1000 ΞΌM ZnCl$_2$ SiNP")
    axes[1].plot(T100_freq, T100_psd, color="turquoise", label="25T 100 ΞΌM ZnCl$_2$ SiNP")
    axes[1].plot(T500_freq, T500_psd, color="purple", label="25T 500 ΞΌM ZnCl$_2$ SiNP")
    axes[1].plot(T750_freq, T750_psd, color="orange", label="25T 750 ΞΌM ZnCl$_2$ SiNP")    
    axes[1].set_xlim(f2_min_value, f2_max_value)
    axes[1].set_yscale("log")
    axes[1].set_xlabel("Frequency (Hz)")
    axes[1].set_ylabel("$S_{xx}$ ($V^2/Hz$)")
    axes[1].set_title("Plot of 25T f2 Peaks")
    axes[1].grid(True)
    axes[1].legend(loc="lower right")

    # Plotting the f3 peak in the third window
    axes[2].plot(T1000_freq, T1000_psd, color="blue", label="25T 1000 ΞΌM ZnCl$_2$ SiNP")
    axes[2].plot(T100_freq, T100_psd, color="turquoise", label="25T 100 ΞΌM ZnCl$_2$ SiNP")
    axes[2].plot(T500_freq, T500_psd, color="purple", label="25T 500 ΞΌM ZnCl$_2$ SiNP")
    axes[2].plot(T750_freq, T750_psd, color="orange", label="25T 750 ΞΌM ZnCl$_2$ SiNP")    
    axes[2].set_xlim(f3_min_value, f3_max_value)
    axes[2].set_yscale("log")
    axes[2].set_xlabel("Frequency (Hz)")
    axes[2].set_ylabel("$S_{xx}$ ($V^2/Hz$)")
    axes[2].set_title("Plot of 25T f3 Peaks")
    axes[2].grid(True)
    axes[2].legend(loc="lower right")

    # Displaying the plot
    plt.tight_layout()
    plt.show()

    return
InΒ [34]:
# Plotting the data with only 25A, 25T and SiNP particle types included
def plot_SiNP_A_T_peaks(T1000_data, A1000_data, SiNP_data, f1_min_value, f1_max_value,
                     f2_min_value, f2_max_value, f3_min_value, f3_max_value):

    # Getting the frequency and PSD for each particle type
    T1000_freq, T1000_psd = T1000_data.get_PSD()
    A100_freq, A100_psd = A1000_data.get_PSD()
    SiNP_freq, SiNP_psd = SiNP_data.get_PSD()

    # Setting up the plot
    fig, axes = plt.subplots(1, 3, figsize=(15, 5), dpi=300)

    # Plotting the f1 peak in the first window
    axes[0].plot(T1000_freq, T1000_psd, color="blue", label="25T 1000 ΞΌM ZnCl$_2$ SiNP")
    axes[0].plot(A100_freq, A100_psd, color="red", label="25A 1000 ΞΌM ZnCl$_2$ SiNP")
    axes[0].plot(SiNP_freq, SiNP_psd, color="gray", label="Standard SiNP") 
    axes[0].set_xlim(f1_min_value, f1_max_value)
    axes[0].set_yscale("log")
    axes[0].set_xlabel("Frequency (Hz)")
    axes[0].set_ylabel("$S_{xx}$ ($V^2/Hz$)")
    axes[0].set_title("Plot of f1 Peaks")
    axes[0].grid(True)
    axes[0].legend(loc="lower right")

    # Plotting the f2 peak in the second window
    axes[1].plot(T1000_freq, T1000_psd, color="blue", label="25T 1000 ΞΌM ZnCl$_2$ SiNP")
    axes[1].plot(A100_freq, A100_psd, color="red", label="25A 1000 ΞΌM ZnCl$_2$ SiNP")
    axes[1].plot(SiNP_freq, SiNP_psd, color="gray", label="Standard SiNP")     
    axes[1].set_xlim(f2_min_value, f2_max_value)
    axes[1].set_yscale("log")
    axes[1].set_xlabel("Frequency (Hz)")
    axes[1].set_ylabel("$S_{xx}$ ($V^2/Hz$)")
    axes[1].set_title("Plot of f2 Peaks")
    axes[1].grid(True)
    axes[1].legend(loc="lower right")

    # Plotting the f3 peak in the third window
    axes[2].plot(T1000_freq, T1000_psd, color="blue", label="25T 1000 ΞΌM ZnCl$_2$ SiNP")
    axes[2].plot(A100_freq, A100_psd, color="red", label="25A 1000 ΞΌM ZnCl$_2$ SiNP")
    axes[2].plot(SiNP_freq, SiNP_psd, color="gray", label="Standard SiNP")    
    axes[2].set_xlim(f3_min_value, f3_max_value)
    axes[2].set_yscale("log")
    axes[2].set_xlabel("Frequency (Hz)")
    axes[2].set_ylabel("$S_{xx}$ ($V^2/Hz$)")
    axes[2].set_title("Plot of f3 Peaks")
    axes[2].grid(True)
    axes[2].legend(loc="lower right")

    # Displaying the plot
    plt.tight_layout()
    plt.show()

    return
InΒ [35]:
# Defining the function to run the code
def run_functions(row_number):
    # Getting the data from one row
    row_data = compare_peaks_table.iloc[row_number]

    # Getting the frequency peak min and max values
    f1_min_value, f1_max_value = get_f1_limits(row_data)
    f2_min_value, f2_max_value = get_f2_limits(row_data)
    f3_min_value, f3_max_value = get_f3_limits(row_data)
    
    # Getting the data for the particular row
    SiNP_data = get_bin_data(compare_peaks_table.loc[row_number, "SiNP"])
    ZnCl2_data = get_bin_data(compare_peaks_table.loc[row_number, "1000uM ZnCl2"])
    A1000_data = get_bin_data(compare_peaks_table.loc[row_number, "25A 1000uM ZnCl2"])
    T1000_data = get_bin_data(compare_peaks_table.loc[row_number, "25T 1000uM ZnCl2"])
    T100_data = get_bin_data(compare_peaks_table.loc[row_number, "25T 100uM ZnCl2"])
    T500_data = get_bin_data(compare_peaks_table.loc[row_number, "25T 500uM ZnCl2"])
    T750_data = get_bin_data(compare_peaks_table.loc[row_number, "25T 750uM ZnCl2"])

    # Running the plot all peaks function
    plot_all_peaks(SiNP_data, ZnCl2_data, A1000_data, T1000_data, T100_data,
                   T500_data, T750_data, f1_min_value, f1_max_value, 
                   f2_min_value, f2_max_value, f3_min_value, f3_max_value)
    
    # Running the plot T peaks function
    plot_T_peaks(T1000_data, T100_data, T500_data, T750_data, f1_min_value, f1_max_value,
                 f2_min_value, f2_max_value, f3_min_value, f3_max_value)
    
    # Running the plot SiNP A T peaks function
    plot_SiNP_A_T_peaks(T1000_data, A1000_data, SiNP_data, f1_min_value, f1_max_value,
                     f2_min_value, f2_max_value, f3_min_value, f3_max_value)

    return
    
InΒ [36]:
# Getting the plots for row 0
run_functions(0)
calcPSD:  True
running self.get_PSD
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
No description has been provided for this image
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
No description has been provided for this image
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
No description has been provided for this image
InΒ [37]:
# Getting the plots for row 1
run_functions(1)
calcPSD:  True
running self.get_PSD
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
No description has been provided for this image
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
No description has been provided for this image
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
No description has been provided for this image
InΒ [38]:
# Getting the plots for row 2
run_functions(2)
calcPSD:  True
running self.get_PSD
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
No description has been provided for this image
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
No description has been provided for this image
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
No description has been provided for this image
InΒ [39]:
# Getting the plots for row 3
run_functions(3)
calcPSD:  True
running self.get_PSD
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
No description has been provided for this image
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
No description has been provided for this image
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
No description has been provided for this image
InΒ [40]:
# Getting the plots for row 4
run_functions(4)
calcPSD:  True
running self.get_PSD
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
calcPSD:  True
running self.get_PSD
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
No description has been provided for this image
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
No description has been provided for this image
Calculating power spectral density
Calculating power spectral density
Calculating power spectral density
No description has been provided for this image

ENDΒΆ